© Krishi Sanskriti Publications http://www.krishisanskriti.org/jbaer.html

# High Speed & Area Efficient Vedic Multiplier using Adiabatic Logic

Dr. Malti Bansal<sup>#1</sup> and Diksha Ruhela<sup>#2</sup>

<sup>#1</sup>Department of Electronics & Communication Engineering, <sup>#2</sup>M.Tech. (VLSI), Department of Electronics & Communication Engineering, Delhi Technological University, Delhi-110042, India

E-mail: <sup>1</sup>maltibansal@gmail.com, <sup>2</sup>ruhela.diksha@gmail.com

**Abstract**: Keeping in mind the demands of today's technology, we review work on Vedic multipliers designed with adiabatic logic. Vedic mathematics is the ancient Indian system of mathematics, which mainly deals with Vedic mathematical formulae and their application to various branches of mathematics. Sri Bharati Krsna Tirtha constructed 16 sutras and 16 upa-sutras after extensive research on the Atharva Veda. Urdhva Tiryakbhyam has been found to be the most efficient of these 16 sutras. High-performance systems such as microprocessors, digital signal processors, filters and ALUs consist of many components, among which the multiplier plays a vital role. Most DSP computations involve multiply-accumulate operations, so the design of fast and efficient multipliers is imperative. However, power, area and speed are usually conflicting constraints, so improving speed mostly results in larger area or higher power. The power consumption of the Vedic multiplier is low because it generates all partial products and their sum in one step. This review can help in choosing a high-speed and energy-efficient multiplier from among multipliers fabricated in different technologies.

Keywords: review, adiabatic, low power, Vedic multiplier

## 1. INTRODUCTION

Vedic mathematics has proved to be a robust technique for arithmetic operations.
In contrast, conventional multiplication techniques introduce a significant delay in the hardware implementation of an n-bit multiplier. Moreover, the combinational delay of the design degrades the performance of the multiplier. Hardware-based multiplication mainly depends on the architecture selected for the FPGA or ASIC. In this work we study a high-speed Vedic multiplier using DTGAL (dual transmission gate adiabatic logic).

## 2. VEDIC MATHEMATICS SUTRAS

Vedic mathematics deals with sixteen sutras [7]. These sutras are listed below alphabetically with their brief meanings. Each of them has been studied extensively, and a discussion of all of them is beyond the scope of this paper. Only sutra number 14, "Urdhva Tiryakbhyam", is discussed.

1. Anurupye Shunyamanyat – If one is in ratio, the other is zero
2. Chalana-Kalanabhyam – Differences and similarities
3. Ekadhikena Purvena – By one more than the previous one
4. Ekanyunena Purvena – By one less than the previous one
5. Gunakasamuchyah – The factors of the sum are equal to the sum of the factors
6. Gunitasamuchyah – The product of the sum is equal to the sum of the products
7. Nikhilam Navatashcaramam Dashatah – All from 9 and the last from 10
8. Paraavartya Yojayet – Transpose and adjust
9. Puranapuranabhyam – By the completion or non-completion
10. Sankalana-vyavakalanabhyam – By addition and by subtraction
11. Shesanyankena Charamena – The remainders by the last digit
12. Shunyam Saamyasamuccaye – When the sum is the same, that sum is zero
13. Sopaantyadvayamantyam – The ultimate and twice the penultimate
14. Urdhva Tiryakbhyam – Vertically and crosswise
15. Vyashtisamanstih – Part and whole
16. Yaavadunam – Whatever the extent of its deficiency

## 3. URDHVA TIRYAKBHYAM

The multiplier is based on the Urdhva Tiryakbhyam (vertically and crosswise) algorithm of ancient Indian Vedic mathematics. The Urdhva Tiryakbhyam sutra is a general multiplication formula applicable to all cases of multiplication.
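
As a minimal sketch of the vertically-and-crosswise idea (not the hardware design of any cited work), the Python function below forms each result column as the sum of all digit products whose positions add up to that column, and then propagates the carries. The function names `urdhva_tiryakbhyam` and `digits_to_int`, the digit-list interface and the `base` parameter are illustrative assumptions.

```python
def urdhva_tiryakbhyam(a_digits, b_digits, base=10):
    """Multiply two equal-length digit lists (most significant digit first).

    Illustrative software model only: every column sum is formed
    independently (the "vertical and crosswise" products), then the
    carries are rippled from the least significant column upwards.
    """
    n = len(a_digits)
    if len(b_digits) != n:
        raise ValueError("both operands must have the same number of digits")

    a = a_digits[::-1]  # index 0 = least significant digit
    b = b_digits[::-1]

    # Column k collects every product a[i] * b[j] with i + j == k.
    columns = []
    for k in range(2 * n - 1):
        lo = max(0, k - n + 1)
        hi = min(k, n - 1)
        columns.append(sum(a[i] * b[k - i] for i in range(lo, hi + 1)))

    # Carry propagation turns the column sums into single digits.
    result = []
    carry = 0
    for column_sum in columns:
        total = column_sum + carry
        result.append(total % base)
        carry = total // base
    while carry:
        result.append(carry % base)
        carry //= base

    return result[::-1]  # most significant digit first


def digits_to_int(digits, base=10):
    """Convert a most-significant-first digit list to an integer."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


if __name__ == "__main__":
    # Decimal operands 26 and 46
    print(urdhva_tiryakbhyam([2, 6], [4, 6]))
    # 4-bit binary operands 1101 (13) and 1011 (11)
    print(digits_to_int(urdhva_tiryakbhyam([1, 1, 0, 1], [1, 0, 1, 1], base=2), base=2))
```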
Its name literally means "vertically and crosswise". It is based on a novel concept in which all partial products are generated concurrently with the addition of these partial products. The multiplication of two 2-digit numbers is shown in the example $26 \times 46$.

Fig. 1: UT multiplication

For an N x N multiplication unit, we require four N/2-bit multipliers, two N-bit full adders, one half adder and one N/2-bit full adder to add the sum and carry of the half adder, as shown in Fig. 2.2 [1]. The speed of the multiplier depends strongly on the speed of the adder units used. For N = 4, a 4x4 multiplication requires four 2x2-bit multipliers, two 4-bit full adders, one half adder and one 2-bit full adder to add the sum and carry of the half adder. Let the two four-bit input numbers be $a_0a_1a_2a_3$ and $b_0b_1b_2b_3$, whose multiplication gives the eight-bit output $s_0$ to $s_7$. First we calculate the four 2x2 partial products, i.e. $a_0a_1 \times b_0b_1$, $a_0a_1 \times b_2b_3$, $a_2a_3 \times b_0b_1$ and $a_2a_3 \times b_2b_3$, each of which generates a four-bit output. The results of these individual partial products are fed to the 4-bit full adders, and the corresponding results to the half adder, as shown in the block diagram in Fig. 2.2, resulting in the eight-bit output $s_0$ to $s_7$. In this approach, three 4-bit ripple carry adders are used and the combinational path delay is found to be 13.102 ns. The results are compared with array and Booth multipliers, and the execution time is observed to be lower for the Vedic multiplier, which thus proves to be better. The block diagram for 4-bit multiplication is shown below.

Fig. 2.2: Block diagram for 4x4 Vedic multiplier using ripple carry adder [1]

## 4. ADIABATIC SWITCHING

In this section adiabatic switching is analysed in detail. Adiabatic logic does not switch abruptly from 0 to $V_{DD}$ (and vice versa); instead, a voltage ramp is used to charge the output and to recover the energy from it.
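
To give a feel for why a ramped supply helps, the following sketch compares two standard first-order estimates: the $\frac{1}{2} C V_{DD}^2$ dissipated when a capacitance is charged abruptly, and the $(RC/T)\,C V_{DD}^2$ dissipated when it is charged through a resistance $R$ by a linear ramp of duration $T \gg RC$. The resistance, capacitance and ramp times below are assumed illustrative values, not figures from the reviewed designs, and the function names are hypothetical.

```python
def conventional_loss(c_load, v_dd):
    """Energy (J) dissipated when c_load is charged abruptly to v_dd."""
    return 0.5 * c_load * v_dd ** 2


def adiabatic_loss(r_on, c_load, v_dd, t_ramp):
    """First-order energy (J) dissipated in r_on for one ramped transition.

    Only meaningful when t_ramp is much larger than r_on * c_load.
    """
    return (r_on * c_load / t_ramp) * c_load * v_dd ** 2


# Assumed example values for illustration only.
R_ON = 1e3        # ohms, hypothetical switch on-resistance
C_LOAD = 100e-15  # farads, hypothetical load capacitance
V_DD = 1.8        # volts

if __name__ == "__main__":
    e_conv = conventional_loss(C_LOAD, V_DD)
    print(f"abrupt switching: {e_conv:.3e} J")
    # A slower ramp lowers the loss in proportion to 1/T.
    for t_ramp in (1e-9, 10e-9, 100e-9):
        e_adia = adiabatic_loss(R_ON, C_LOAD, V_DD, t_ramp)
        print(f"ramp {t_ramp:.0e} s: {e_adia:.3e} J, ratio {e_conv / e_adia:.1f}")
```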
The principle of operating an adiabatic gate is presented for a buffer gate in Efficient Charge Recovery Logic (ECRL, [2]) in Fig. 3.1. The gate consists of two cross-coupled PMOS devices that are used to store the information. The logic function is constructed via two NMOS devices. Cascaded gates are operated by a four-phase power-clock signal. Input signals for the ECRL gate in Fig. 3.1 are shifted by 90° with respect to the applied power-clock signal. Assume, for instance, that the input $in$ is at logic one and the complementary input $\overline{in}$ is at zero. Then the NMOS device N1 conducts and connects $\overline{out}$ to ground, while N2 is disabled. As soon as the power-clock $\varphi$, ramped from 0 to $V_{DD}$, reaches the threshold voltage $V_{th,p}$ of the PMOS device, P2 is turned on. Thus the output signal $out$ follows the power-clock $\varphi$. The gate voltage of device P1 is now equal to the supply voltage, so its gate-to-source voltage is zero and this device stays disabled. As soon as $\varphi$ reaches the maximum level $V_{DD}$, the input signals are ramped down, because the preceding gate recovers its energy at this time. The PMOS devices store the information while both NMOS devices are disabled. Then the power-clock descends from $V_{DD}$ to 0. While $\varphi$ is above $V_{th,p}$, charge from the output $out$ is returned to $\varphi$. A certain fraction of energy, $\frac{1}{2} C_{out} V_{th,p}^2$, remains on the corresponding output capacitance and is dissipated or reused in the next cycle, depending on the succeeding input signals.

Fig. 3.1: An ECRL buffer and an exemplary scheme of the signals in the gate in operation

To calculate the energy consumed by charging a capacitance adiabatically, the equivalent circuit in Fig. 3.2 for an adiabatic gate is used.

Fig. 3.2: Equivalent circuit to determine the losses by adiabatically loading a capacitance

It can be seen that ECRL has non-adiabatic loss on its output nodes.
Because ECRL uses two cross-coupled pMOS transistors for both pre-charge and energy recovery, its energy loss per cycle is

$$E_{ECRL} = (2 R_P C_L / T) C_L V_{DD}^2 + C_L V_{TP}^2,$$ (1)

where $C_L$ is the load capacitance, $R_P$ is the turn-on resistance of the pMOS transistors, $T$ is the transition time of the power-clock, $V_{DD}$ is the peak voltage of the power-clocks, and $V_{TP}$ is the threshold voltage of the pMOS transistors. Thus, from Equation (1), it can be inferred that the non-adiabatic energy loss is dependent on the load capacitance and independent of the frequency of operation.

To overcome this disadvantage, a dual transmission gate adiabatic logic (DTGAL) has been presented in [2], as shown in Fig. 3.3. It is composed of two main parts: the logic evaluation circuit and the energy-recovery circuit. The logic evaluation circuit consists of transmission gates (Ni, P1) and (Nib, P2). The energy-recovery circuit consists of transmission gates (N1, P1) and (N2, P2). The cross-coupled transistors (N3 and N4) ground the un-driven output node. The power-clock clk charges the output (out or outb) through Ni and P1 (or Nib and P2) under the control of the inputs (in and inb). The energy of the output nodes is recovered to clk through N1 and P1 (or N2 and P2) under the control of the feedback signals (fin and finb), which come from the outputs of the next-stage buffer.

Fig. 3.3: DTGAL buffer, power-clocks, and its simulated waveform [3].

Cascaded DTGAL gates are driven by the same four-phase power-clocks as ECRL circuits. For the final-stage DTGAL gate in a pipelined chain, an additional ECRL buffer (the lower buffer in Fig. 3.4) is used, and its outputs (fin4 and finb4) control the energy recovery of the final-stage DTGAL gate. The simulated waveform (out4) is shown in Fig. 3.5. It can be seen that DTGAL has no non-adiabatic loss on its output nodes. An adiabatic driving scheme for large load capacitances is shown in Fig. 3.4 [3].
The energy loss per cycle of the adiabatic driver can be written as

$$E_{Driver} = (2 R C_L / T) C_L V_{DD}^2 + (2 R_{DTGAL} C_1 / T) C_1 V_{DD}^2 + (2 R_P C_2 / T) C_2 V_{DD}^2 + C_2 V_{TP}^2,$$ (2)

where $R$ is the turn-on resistance of the transmission gates (Ni, N1, and P1) or (Nib, N2, and P2) of the DTGAL buffer driving $C_L$, $R_{DTGAL}$ is the turn-on resistance of the transmission gates of the first-stage DTGAL buffer, $R_P$ is the turn-on resistance of the PMOS transistors of the ECRL buffer, and $C_1$ and $C_2$ are the capacitances of the nodes out1 and fin2, respectively. In (2), the first term represents the energy loss of the DTGAL buffer driving $C_L$, the second term represents the energy loss of the first-stage DTGAL buffer, and the third and fourth terms represent the energy loss of the ECRL buffer. Although the ECRL buffer has the non-adiabatic energy loss $C_2 V_{TP}^2$, this loss is small, because the capacitance $C_2$, which mainly consists of the gate capacitance of N1 (or N2) in the DTGAL buffer, is far smaller than the load capacitance $C_L$.

Fig. 3.4: Adiabatic driving scheme for large load capacitance [4].

Fig. 3.5: Total energy dissipation per cycle of the adiabatic driver versus channel width of transistors (Ni, Nib, N1 and N2). The frequency is 100 MHz and $V_{DD}$ is 1.8 V [3].

For a large load capacitance, the energy loss of the DTGAL buffer driving $C_L$ can be reduced by increasing the channel width of the transmission gates, but this increases $C_1$ and $C_2$ in (2). Therefore, the size of the transmission gates can be chosen to minimize the total energy dissipation [2]. Fig. 3.5 shows simulation results of the total energy dissipation of the adiabatic driver for various channel widths of the transmission gates using a 0.18 µm TSMC process [3].

## 5. CONCLUSION & FUTURE SCOPE

Comparative studies of the Vedic multiplier against other multipliers indicate that the Vedic multiplier has the lowest power consumption among the optimized multiplier circuits considered. Since a greater number of adders is used for larger multipliers, the power savings at small operand sizes can be directly extrapolated to multiplier modules with larger operands. One ought to consider the energy-delay product (EDP), which combines a measure of performance and energy and is a more relevant metric than the power-delay product (PDP). After studying the various multiplier techniques, the following result may be recapitulated: the PDP of the conventional Vedic multiplier is worse than the PDP of the adiabatic multiplier, owing to the negligible non-adiabatic loss in DTGAL logic. Though the adiabatic multiplier is comparatively slower, its PDP is very low because of its very low power consumption. It has been found that the 8x8 (2x2) adiabatic Vedic multiplier saves almost 16.5% (57.5%) compared with the 8x8 (2x2) conventional Vedic multiplier. Current work is oriented towards improving the specifications that the Vedic multiplier already offers. As an inspiration for future work, it is expected that more Vedic mathematics sutras (algorithms) will be revived which, when implemented using adiabatic logic, can challenge conventional methods in terms of speed, accuracy and power consumption.

## REFERENCES

- [1] Verma, P.: "Design of 4X4 bit Vedic Multiplier using EDA Tool," International Journal of Computer Application (IJCA), Vol. 8, June 2012.
- [2] Jianping Hu, Xien Ye, Yinshui Xia, "A new dual transmission gate adiabatic logic and design of an 8 multiplied by 8-bit multiplier for low-power DSP," IEEE 7th Inter. Conf. on Signal Processing, Beijing, China, pp. 559-562, 2004.
- [3] Jianping Hu, Ling Wang, and Tiefeng Xu, "A Low-Power Adiabatic Multiplier Based on Modified Booth Algorithm," 2007 IEEE International Symposium on Integrated Circuits (ISIC-2007).
- [4] Cheng, F., Unger, S. H., Theobald, M.: "Self-Timed Carry-Lookahead Adders," IEEE Transactions on Computers, Vol. 49, No. 7, July 2000, pp. 659-672.
- [5] J. P. Hu, X. T. Feng, J. J. Yu, and Y. S. Xia, "Low power dual transmission gate adiabatic logic circuits and design of SRAM," The 47th International Midwest Symposium on Circuits and Systems, Hiroshima, Japan, pp. 565-568, July 2004.
- [6] Asati, A. Chandrasekhar, "An improved high speed fully pipelined 500 MHz 8×8 Baugh Wooley multiplier design using 0.6 µm CMOS TSPC logic design style," ICHS 2008, pp. 1-6, Dec. 2008.
- [7] L. Ciminiera and A. Valenzano, "Low cost serial multipliers for high-speed specialised processors," Computers and Digital Techniques, IEE Proc. E, vol. 135.5, 1988, pp. 259-265.
- [8] M. Ramalatha, K. Deena Dayalan, S. Deborah Priya, "High Speed Energy Efficient ALU Design using Vedic Multiplication Techniques," Advances in Computational Tools for Engineering Applications, 2009, IEEE Proc., pp. 600-603.
- [9] Beiu, "Microprocessor and a digital signal processor including adder and multiplier circuits employing logic gates having discrete and weighted inputs," United States Patent 6,516,331, February 4, 2003.
- [10] M. Chanda, S. Banerjee, D. Saha, S. Jain, "Novel Transistor Level Realisation of Ultralow power High Speed Adiabatic Vedic Multiplier," IEEE.
- [11] B. Jagadguru Swami Sri Bharath, Krsna Tirathji, "Vedic Mathematics or Sixteen Simple Sutras From The Vedas," Motilal Banarsidas, Varanasi (India), 1986.
- [12] H. Thapliyal, "VLSI implementation of RSA encryption system using ancient Indian Vedic mathematics," Proc. of VLSI Circuit and System, vol. 5837, pp. 888-892, 2005.
- [13] Y. Takahashi, T. Sekine, and M. Yokoyama, "Two phase clocked CMOS adiabatic logic," in Proc. IEEE Asia Pacific Conf. Circuits and Systems, Macao, China, Nov. 30-Dec. 3, 2008.
- [14] M. A. Erle and M. J. Schulte, "Decimal multiplication via carry-save addition," *Proc. 14th IEEE International Conf. Application Specific Systems*, pp. 348–358, June 2003.
- [15] G. Sutter, E. Todorovich, G. Bioul, M. Vazquez, and J.-P. Deschamps, "FPGA implementations of BCD multipliers," Proc. IEEE International Conf. Reconfigurable Computing and FPGAs, pp. 36–41, Dec. 2009.
- [16] Vestias, M.P. and Neto, H.C.: "Iterative Decimal Multiplication Using Binary Arithmetic," 7th Southern Conference Programmable Logic, pp. 257–262, April 2011.
- [17] M.E. Paramasivam and R.S. Sabeenian, "An efficient bit reduction binary multiplication algorithm using Vedic methods," IEEE 2nd International Advance Computing Conference, Patiala, India, pp. 25, 19-20 Feb. 2010.
- [18] Saokar, S. S., R.M., and Siddamal, S.: "High Speed Signed Multiplier for Digital Signal Processing Applications," Proc. IEEE International Conference on Signal Processing, Computing and Control (ISPCC), Waknaghat Solan, 15-17 March 2012, pp. 1-6.
- [19] Kumar, A. and Raman, A.: "Low Power ALU Design by Ancient Mathematics," presented at IEEE ICAAE, Singapore, Feb. 2010, pp. 862-865.
- [20] Hanumantharaju, M.C., Jayalaxmi, H., Renuka, R.K., and Ravishankar, M.: "A High Speed Block Convolution Using Ancient Indian Vedic Mathematics," IEEE International Conference on Computational Intelligence and Multimedia Applications, Sivakasi, Tamil Nadu, 13-15 Dec. 2007, pp. 169-173.
- [21] Prakash, A.R., Kirubaveni, S.: "Performance evaluation of FFT processor using conventional and Vedic algorithm," IEEE International Conference on Emerging Trends in Computing, Communication and Nanotechnology (ICECCN), Tirunelveli, March 2013, pp. 89-94.
- [22] Saha, P., Banerjee, A., Dandapat, A., and Bhattacharyya, P.: "ASIC design of a high speed low power circuit for factorial calculation using ancient Vedic mathematics," Elsevier Microelectronics Journal, 2011, vol. 42, pp. 1343-1352.